Libraries and Packages


In [1]:
import pymongo
from pymongo import MongoClient
import time
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import csv

Connecting to National Data Service: The Lab Benchwork's NBI - MongoDB instance


In [2]:
Client = MongoClient("mongodb://bridges:readonly@nbi-mongo.admin/bridge")
db = Client.bridge
collection = db["bridges"]

Deterioration curves of Midwestern United states

For demonstration purposes, the results only focuses on the states in the Midwestern United States which includes: Illinois, Indiana, Iowa, Kansas, Michigan, Minnesota, Missouri, Nebraska, North Dakota, Ohio, South Dakota, and Wisconsin.

The classification of the bridge into slow deteriorating, fast deteriorating, and average deteriorating is done based on bridge's rate of deterioration. Therefore, In this section will demonstrate how bridges deteriorate over time in the Midwestern United States. To plot the deterioration curve of bridges in every state of Midwestern United States, bridges were grouped by their age. As a result, There are 60 groups of bridges from age 1 to 60, The mean of the condition rating of the deck, superstructure, and substructure of the bridge is plotted for every age.

Extracting Data of Midwestern states of the United states from 1992 - 2016.

The following query will extract data from the mongoDB instance and project only selected attributes such as structure number, yearBuilt, deck, year, superstructure, and subtructure.


In [ ]:
def getData(state):
    pipeline = [{"$match":{"$and":[{"year":{"$gt":2015, "$lt":2017}},{"stateCode":state}]}},
            {"$project":{"_id":0,
                         "stateCode":1,
                         "structureNumber":1,
                         "yearBuilt":1,
                         "deck":1, ## rating of deck
                         "year":1, ## survey year
                         "substructure":1, ## rating of substructure
                         "superstructure":1, ## rating of superstructure
                         }}]
    
    dec = collection.aggregate(pipeline)
    conditionRatings = pd.DataFrame(list(dec))  
    conditionRatings['Age'] = conditionRatings['year'] - conditionRatings['yearBuilt']
    return conditionRatings

Filtering Null Values, Converting JSON format to Dataframes, and Calculating Mean Condition Ratings of Deck, Superstructure, and Substucture

After NBI data is extracted. The Data has to be filtered to remove data points with missing values such as 'N', 'NA'. The mean condition rating for all the components: Deck, Substructure, and Superstructe, has to be calculated.


In [4]:
def getMeanRatings(state,startAge, endAge, startYear, endYear):
    conditionRatings = getData(state)
    conditionRatings = conditionRatings[['structureNumber','Age','superstructure','deck','substructure','year']]
    conditionRatings = conditionRatings.loc[~conditionRatings['superstructure'].isin(['N','NA'])]
    conditionRatings = conditionRatings.loc[~conditionRatings['substructure'].isin(['N','NA'])]
    conditionRatings = conditionRatings.loc[~conditionRatings['deck'].isin(['N','NA'])]
    #conditionRatings = conditionRatings.loc[~conditionRatings['Structure Type'].isin([19])]
    #conditionRatings = conditionRatings.loc[~conditionRatings['Type of Wearing Surface'].isin(['6'])]
    
    maxAge = conditionRatings['Age'].unique()
    tempConditionRatingsDataFrame = conditionRatings.loc[conditionRatings['year'].isin([i for i in range(startYear, endYear+1, 1)])]
    
    MeanDeck = []
    StdDeck = []
    
    MeanSubstructure = []
    StdSubstructure = []
    
    MeanSuperstructure = []
    StdSuperstructure = []
    
    ## start point of the age to be = 1 and ending point = 100
    for age in range(startAge,endAge+1,1):
        ## Select all the bridges from with age = i
        tempAgeDf = tempConditionRatingsDataFrame.loc[tempConditionRatingsDataFrame['Age'] == age]
        
        ## type conversion deck rating into int
        listOfMeanDeckOfAge = list(tempAgeDf['deck'])
        listOfMeanDeckOfAge = [ int(deck) for deck in listOfMeanDeckOfAge ] 
        
        ## takeing mean and standard deviation of deck rating at age i
        meanDeck = np.mean(listOfMeanDeckOfAge)
        stdDeck = np.std(listOfMeanDeckOfAge)
        
        ## type conversion substructure rating into int
        listOfMeanSubstructureOfAge = list(tempAgeDf['substructure'])
        listOfMeanSubstructureOfAge = [ int(substructure) for substructure in listOfMeanSubstructureOfAge ] 
        
        meanSub = np.mean(listOfMeanSubstructureOfAge)
        stdSub = np.std(listOfMeanSubstructureOfAge)
        
        
        ## type conversion substructure rating into int
        listOfMeanSuperstructureOfAge = list(tempAgeDf['superstructure'])
        listOfMeanSuperstructureOfAge = [ int(superstructure) for superstructure in listOfMeanSuperstructureOfAge ] 
        
        meanSup = np.mean(listOfMeanSuperstructureOfAge)
        stdSup = np.std(listOfMeanSuperstructureOfAge)
      
        #Append Deck
        MeanDeck.append(meanDeck)
        StdDeck.append(stdDeck)
        
        #Append Substructure
        MeanSubstructure.append(meanSub)
        StdSubstructure.append(stdSub)
        
        #Append Superstructure
        MeanSuperstructure.append(meanSup)
        StdSuperstructure.append(stdSup)
        
    return [MeanDeck, StdDeck ,MeanSubstructure, StdSubstructure, MeanSuperstructure, StdSuperstructure]

Creating DataFrames of the Mean condition ratings of the deck, superstructure and substructure

The calculated Mean Condition Ratings of deck, superstructure, and substructure are now stored in seperate dataframe for the convience.


In [ ]:
states = ['31','19','17','18','20','26','27','29','38','46','39','55']

# state code to state abbreviation 
stateNameDict = {'25':'MA',
                 '04':'AZ',
                 '08':'CO',
                 '38':'ND',
                 '09':'CT',
                 '19':'IA',
                 '26':'MI',
                 '48':'TX',
                 '35':'NM',
                 '17':'IL',
                 '51':'VA',
                 '23':'ME',
                 '16':'ID',
                 '36':'NY',
                 '56':'WY',
                 '29':'MO',
                 '39':'OH',
                 '28':'MS',
                 '11':'DC',
                 '21':'KY',
                 '18':'IN',
                 '06':'CA',
                 '47':'TN',
                 '12':'FL',
                 '24':'MD',
                 '34':'NJ',
                 '46':'SD',
                 '13':'GA',
                 '55':'WI',
                 '30':'MT',
                 '54':'WV',
                 '15':'HI',
                 '32':'NV',
                 '37':'NC',
                 '10':'DE',
                 '33':'NH',
                 '44':'RI',
                 '50':'VT',
                 '42':'PA',
                 '05':'AR',
                 '20':'KS',
                 '45':'SC',
                 '22':'LA',
                 '40':'OK',
                 '72':'PR',
                 '41':'OR',
                 '27':'MN',
                 '53':'WA',
                 '01':'AL',
                 '31':'NE',
                 '02':'AK',
                 '49':'UT'
               }

def getBulkMeanRatings(states, stateNameDict):
    # Initializaing the dataframes for deck, superstructure and subtructure
    df_mean_deck = pd.DataFrame({'Age':range(1,61)})
    df_mean_sup = pd.DataFrame({'Age':range(1,61)})
    df_mean_sub = pd.DataFrame({'Age':range(1,61)})
    
    df_std_deck = pd.DataFrame({'Age':range(1,61)})
    df_std_sup = pd.DataFrame({'Age':range(1,61)})
    df_std_sub = pd.DataFrame({'Age':range(1,61)})

    for state in states:
        meanDeck, stdDeck, meanSub, stdSub, meanSup, stdSup = getMeanRatings(state,1,100,1992,2016)
        stateName = stateNameDict[state]
        print('State Name: ',stateName)
        df_mean_deck[stateName] = meanDeck[:60]
        df_mean_sup[stateName] = meanSup[:60]
        df_mean_sub[stateName] = meanSub[:60]
        
        df_std_deck[stateName] = stdDeck[:60]
        df_std_sup[stateName] = stdSup[:60]
        df_std_sub[stateName] = stdSub[:60]
        
    return df_mean_deck, df_mean_sup, df_mean_sub, df_std_deck, df_std_sup, df_std_sub
    
df_mean_deck, df_mean_sup, df_mean_sub, df_std_deck, df_std_sup, df_std_sub = getBulkMeanRatings(states, stateNameDict)

In [ ]:
df_mean_deck.to_csv['deckmidwestern.csv']

Deterioration Curves - Deck

The figure below the deterioration curve of mean deck rating of twelve midwestern states in united states. All of the states have similar deterioration curves. Most of the bridges when first built are given a condition rating of 8 or 9. Michigan's (MI) deterioration curve is anomaly as it starts with a mean condition rating of approximately 7.2.

The following charts presents two different perspective to look at deterioration of bridges in midwestern united states:

  1. Deterioration Curves of all the states in Midwestern United States together, to have a correct idea of how these deterioration curves relate to each other.
  2. Individual Deterioration curves of all the states in Midwestern United States.

In [9]:
%matplotlib inline
states = ['31','19','17','18','20','26','27','29','38','46','39','55']
palette = [ 'blue', 'blue', 'green', 'magenta', 'cyan', 'brown', 'grey',
    'red','silver','purple', 'gold', 'black','olive' ]

plt.figure(figsize = (10,8))
index = 0
for state in states:
    index = index + 1
    stateName = stateNameDict[state]
    plt.plot(df_mean_deck['Age'],df_mean_deck[stateName], color = palette[index])
plt.legend([stateNameDict[state] for state in states],loc='upper right', ncol = 2) 
plt.xlim(1,60)
plt.ylim(1,9)
plt.title('Mean Deck Rating Vs Age')
plt.xlabel('Age')
plt.ylabel('Mean Deck Rating')


Out[9]:
<matplotlib.text.Text at 0x7f1881dc5f28>

In [107]:
plt.figure(figsize = (16,12))
plt.xlabel('Age')
plt.ylabel('Mean')


# Initialize the figure
plt.style.use('seaborn-darkgrid')
 
# create a color palette
#palette = plt.get_cmap('gist_ncar')
palette = [
    'blue', 'blue', 'green','magenta','cyan','brown','grey','red','silver','purple','gold','black','olive'
]
# multiple line plot
num=1
for column in df_mean_deck.drop('Age', axis=1):
    
    # Find the right spot on the plot
    plt.subplot(4,3, num)
 
    # Plot the lineplot
    plt.plot(df_mean_deck['Age'], df_mean_deck[column], marker='', color=palette[num], linewidth=4, alpha=0.9, label=column)
 
    # Same limits for everybody!
    plt.xlim(1,60)
    plt.ylim(1,9)
 
    # Not ticks everywhere
    if num in range(10) :
        plt.tick_params(labelbottom='off')
    if num not in [1,4,7,10]:
        plt.tick_params(labelleft='off')
 
    # Add title
    plt.title(column, loc='left', fontsize=12, fontweight=0, color=palette[num])
    plt.text(30, -1, 'Age', ha='center', va='center')
    plt.text(1, 4, 'Mean Deck Rating', ha='center', va='center', rotation='vertical')
    num = num + 1
 
# general title
plt.suptitle("Mean Deck Rating vs  Age \nIndividual State Deterioration Curves", fontsize=13, fontweight=0, color='black', style='italic', y=1.02)


Out[107]:
<matplotlib.text.Text at 0x7f05c56edeb8>

In [25]:


In [16]:
df_mean_deck


Out[16]:
Age NE IA IL IN KS MI MN MO ND SD OH WI
0 1 8.863208 8.902138 8.818163 8.330150 8.214320 7.192911 8.860465 8.882924 8.877660 7.736185 8.888453 8.554825
1 2 8.879425 8.767092 8.748394 8.223964 8.160431 8.182007 8.681173 8.780413 8.750000 7.659722 8.731419 8.416721
2 3 8.584222 8.618573 8.502533 8.087523 8.071550 8.239312 8.463983 8.660272 8.593182 7.514650 8.579071 8.112893
3 4 8.454126 8.495514 8.390409 7.922415 7.990483 8.121513 8.293946 8.538644 8.448498 7.484069 8.453925 7.990915
4 5 8.316667 8.382911 8.331194 7.811000 7.921513 8.017623 8.135691 8.431316 8.341615 7.454118 8.333293 7.867805
5 6 8.220605 8.278469 8.253328 7.696310 7.858413 7.949846 8.039809 8.329422 8.260274 7.435466 8.214577 7.792788
6 7 8.106157 8.161430 8.150652 7.608547 7.808676 7.863598 7.945776 8.225009 8.194444 7.381738 8.117849 7.705699
7 8 8.045325 8.058335 8.088053 7.513095 7.752656 7.781766 7.869038 8.122709 8.135802 7.324034 8.031449 7.635449
8 9 7.955871 7.968518 8.035218 7.421488 7.707713 7.713703 7.791699 8.025108 8.079935 7.295643 7.951960 7.576505
9 10 7.892497 7.883307 7.977058 7.344177 7.661010 7.661626 7.724774 7.916602 8.015898 7.228940 7.880749 7.529116
10 11 7.809362 7.802873 7.922128 7.276777 7.618498 7.590278 7.682794 7.802867 7.972603 7.177866 7.815012 7.464926
11 12 7.766090 7.732319 7.884914 7.203431 7.565363 7.540845 7.646100 7.712785 7.902098 7.144956 7.746450 7.420000
12 13 7.690751 7.654193 7.840785 7.141721 7.522655 7.485172 7.632309 7.624137 7.840000 7.089021 7.672096 7.376759
13 14 7.648349 7.583847 7.796394 7.078546 7.482316 7.437107 7.608221 7.559926 7.806250 7.086915 7.602988 7.335245
14 15 7.585601 7.525132 7.743365 7.032206 7.450841 7.389042 7.581172 7.511392 7.745498 7.053073 7.543422 7.280625
15 16 7.545785 7.459634 7.698190 6.977521 7.395753 7.346875 7.536835 7.449876 7.713956 7.017383 7.474123 7.241480
16 17 7.438588 7.402080 7.653076 6.926564 7.356295 7.278081 7.487439 7.389860 7.637698 6.993794 7.412011 7.176925
17 18 7.375148 7.348329 7.591970 6.891917 7.304819 7.211612 7.447079 7.339330 7.595349 6.992580 7.336897 7.121042
18 19 7.310419 7.303433 7.543432 6.843096 7.258263 7.161732 7.419140 7.270651 7.533482 6.965436 7.275421 7.081457
19 20 7.263346 7.247742 7.487701 6.798946 7.223570 7.094990 7.380174 7.214703 7.476562 6.929067 7.198620 7.059318
20 21 7.196981 7.189455 7.444231 6.778665 7.182267 7.004571 7.338805 7.159355 7.407285 6.881432 7.131055 6.998915
21 22 7.132460 7.136324 7.391015 6.734905 7.144637 6.905224 7.293758 7.094841 7.350158 6.870790 7.055461 6.954043
22 23 7.056388 7.093686 7.342733 6.694288 7.090430 6.816677 7.251453 7.036378 7.324769 6.815574 6.983270 6.911680
23 24 7.012990 7.041891 7.285021 6.664146 7.051613 6.766862 7.209739 6.966348 7.264334 6.798147 6.909898 6.818044
24 25 6.971287 6.983815 7.224873 6.637320 7.015568 6.692729 7.160511 6.914157 7.220767 6.731487 6.850459 6.757490
25 26 6.936968 6.925438 7.163129 6.615840 6.991681 6.636845 7.111634 6.861614 7.149499 6.684145 6.795712 6.701666
26 27 6.894633 6.883452 7.103577 6.579165 6.952618 6.590526 7.091599 6.794062 7.068602 6.651246 6.753337 6.622203
27 28 6.865848 6.832758 7.033110 6.564860 6.920447 6.509351 7.053750 6.712476 7.049037 6.616828 6.705723 6.589761
28 29 6.841152 6.791448 6.970704 6.521692 6.882968 6.456015 7.017857 6.640224 6.998236 6.558356 6.667449 6.549944
29 30 6.787506 6.751763 6.917007 6.484256 6.858407 6.384038 6.989778 6.576117 6.984916 6.490831 6.620456 6.511685
30 31 6.732779 6.708517 6.866748 6.476251 6.823628 6.353434 6.937108 6.510134 6.948879 6.460733 6.586225 6.467521
31 32 6.676843 6.663686 6.822165 6.460437 6.765131 6.298843 6.911831 6.470787 6.857385 6.461425 6.550743 6.427530
32 33 6.650106 6.622601 6.775159 6.451547 6.740287 6.259512 6.858194 6.414243 6.809367 6.449165 6.535140 6.415020
33 34 6.622652 6.562699 6.694218 6.436199 6.723592 6.247158 6.812273 6.340944 6.763955 6.414103 6.517025 6.397835
34 35 6.579065 6.518277 6.655266 6.412329 6.695836 6.231977 6.786738 6.269007 6.741365 6.374111 6.492434 6.354588
35 36 6.567043 6.457273 6.629586 6.414446 6.671413 6.215965 6.767971 6.207802 6.711365 6.336312 6.484475 6.318624
36 37 6.536067 6.427139 6.612512 6.401184 6.636025 6.213948 6.731593 6.183430 6.622990 6.303809 6.471530 6.298531
37 38 6.503614 6.390600 6.599482 6.394035 6.615300 6.194476 6.697956 6.126843 6.615258 6.260274 6.500621 6.280038
38 39 6.483981 6.345455 6.545641 6.381292 6.602065 6.186063 6.647754 6.077058 6.598354 6.241444 6.486915 6.280106
39 40 6.480857 6.300208 6.516760 6.374962 6.570992 6.191011 6.597564 6.026643 6.580216 6.223923 6.476761 6.270143
40 41 6.444556 6.261368 6.505641 6.357636 6.551741 6.196655 6.551820 5.992713 6.568220 6.194336 6.475073 6.253446
41 42 6.437139 6.236854 6.487805 6.322066 6.532365 6.174039 6.526550 6.040558 6.489461 6.146856 6.470383 6.247957
42 43 6.422460 6.209867 6.476218 6.305056 6.492932 6.136014 6.518656 6.026424 6.478397 6.122646 6.461364 6.181866
43 44 6.407703 6.176172 6.452438 6.277856 6.483241 6.144133 6.484236 6.005513 6.443018 6.117227 6.462271 6.174907
44 45 6.410656 6.138382 6.442664 6.253915 6.453347 6.157020 6.447328 5.974575 6.440000 6.105652 6.459606 6.157322
45 46 6.377551 6.098998 6.406424 6.206743 6.416711 6.141787 6.393090 5.950362 6.429745 6.074385 6.445516 6.159453
46 47 6.379029 6.051260 6.399931 6.197832 6.367886 6.141720 6.356672 5.990776 6.400335 6.091706 6.432071 6.150932
47 48 6.319880 6.003089 6.384320 6.219572 6.347753 6.161587 6.314675 5.966240 6.381803 6.065415 6.430178 6.142177
48 49 6.269877 5.962029 6.365164 6.200097 6.331318 6.144574 6.273515 5.939365 6.342883 6.021167 6.437113 6.129334
49 50 6.251934 5.920160 6.379628 6.179035 6.327614 6.129565 6.217372 5.928678 6.320667 5.981132 6.455874 6.096197
50 51 6.219072 5.889483 6.355150 6.151984 6.295551 6.096524 6.210152 5.921760 6.307692 5.957103 6.453324 6.034921
51 52 6.166154 5.903594 6.346234 6.114666 6.292293 6.096572 6.197901 6.038619 6.261261 5.978697 6.485026 6.028276
52 53 6.106314 5.872575 6.349555 6.104736 6.261952 6.042646 6.201592 6.029979 6.236915 5.969986 6.473438 5.995417
53 54 6.057310 5.855579 6.331558 6.114988 6.250512 6.027580 6.179188 6.033949 6.194043 5.960087 6.462164 6.018828
54 55 6.007985 5.837022 6.328810 6.105134 6.246756 6.018657 6.152161 6.008912 6.196007 5.958365 6.445981 6.022242
55 56 6.001289 5.799967 6.331024 6.096133 6.198033 6.004180 6.133333 6.018822 6.158992 5.971564 6.432085 6.000948
56 57 6.061284 5.767783 6.315756 6.085325 6.189698 5.981387 6.109443 5.991489 6.073386 6.036111 6.406723 5.957157
57 58 6.020807 5.763060 6.322669 6.054430 6.136216 6.000598 6.078231 5.970484 6.032772 6.023055 6.372250 5.955533
58 59 5.985519 5.725334 6.269060 6.051827 6.114170 6.021696 6.083333 5.929034 5.970120 5.947900 6.353141 5.957669
59 60 5.953795 5.720416 6.249478 6.026549 6.111450 6.016427 6.067504 5.905030 5.954407 5.910165 6.325772 5.957103

Deterioration Curve - Superstructure

The figure below the deterioration curve of mean Superstructure rating of twelve midwestern states in united states. All of the states have similar deterioration for all superstructure, substructure, and deck curves. This provides a hint that the components of the bridge such as superstructure, subtructure, and deck are highly correlated. Most of the bridges when first built are given a condition rating of 8 or 9. Michigan's (MI) deterioration curve is anomaly as it starts with a mean condition rating of approximately 7.2.

The following charts presents similar perspective to look at deterioration of bridges in midwestern united states, like we have looked into deterioration deck.


In [19]:
states = ['31','19','17','18','20','26','27','29','38','46','39','55']
palette = [ 'blue', 'blue', 'green', 'magenta', 'cyan', 'brown', 'grey',
    'red','silver','purple', 'gold', 'black','olive' ]

plt.figure(figsize = (10,8))
index = 0
for state in states:
    index = index + 1
    stateName = stateNameDict[state]
    plt.plot(df_mean_sup['Age'],df_mean_sup[stateName], color = palette[index])
plt.legend([stateNameDict[state] for state in states],loc='upper right', ncol = 2) 
plt.xlim(1,60)
plt.ylim(1,9)
plt.title('Mean Superstructure Rating Vs Age')
plt.xlabel('Age')
plt.ylabel('Mean Superstructure Rating')


Out[19]:
<matplotlib.text.Text at 0x7f05ab5d3c50>

In [106]:
plt.figure(figsize = (16,12))
plt.xlabel('Age')
plt.ylabel('Mean')


# Initialize the figure
plt.style.use('seaborn-darkgrid')
 
# create a color palette
#palette = plt.get_cmap('gist_ncar')
palette = [
    'blue',
    'blue',
    'green',
    'magenta',
    'cyan',
    'brown',
    'grey',
    'red',
    'silver',
    'purple',
    'gold',
    'black',
    'olive'
]
# multiple line plot
num=1
for column in df_mean_sup.drop('Age', axis=1):
    
    # Find the right spot on the plot
    plt.subplot(4,3, num)
 
    # Plot the lineplot
    plt.plot(df_mean_sup['Age'], df_mean_sup[column], marker='', color=palette[num], linewidth=4, alpha=0.9, label=column)
 
    # Same limits for everybody!
    plt.xlim(1,60)
    plt.ylim(1,9)
 
    # Not ticks everywhere
    if num in range(10) :
        plt.tick_params(labelbottom='off')
    if num not in [1,4,7,10]:
        plt.tick_params(labelleft='off')
 
    # Add title
    plt.title(column, loc='left', fontsize=12, fontweight=0, color=palette[num])
    plt.text(30, -1, 'Age', ha='center', va='center')
    plt.text(1, 4, 'Mean Superstructure Rating', ha='center', va='center', rotation='vertical')
    num = num + 1
 
# general title
plt.suptitle("Mean Superstructure Rating vs  Age \nIndividual State Deterioration Curves", fontsize=13, fontweight=0, color='black', style='italic', y=1.02)


Out[106]:
<matplotlib.text.Text at 0x7f05c5305ef0>

Deterioration Curves - Substructure


In [42]:
states = ['31','19','17','18','20','26','27','29','38','46','39','55']
palette = [ 'blue', 'blue', 'green', 'magenta', 'cyan', 'brown', 'grey',
    'red','silver','purple', 'gold', 'black','olive' ]

plt.figure(figsize = (10,8))
index = 0
for state in states:
    index = index + 1
    stateName = stateNameDict[state]
    plt.plot(df_mean_sub['Age'],df_mean_sub[stateName], color = palette[index], linewidth=4)
plt.legend([stateNameDict[state] for state in states],loc='upper right', ncol = 2) 
plt.xlim(1,60)
plt.ylim(1,9)
plt.title('Mean Substructure Rating Vs Age')
plt.xlabel('Age')
plt.ylabel('Mean Substructure Rating')


Out[42]:
<matplotlib.text.Text at 0x7f05c5256c88>

In [43]:
plt.figure(figsize = (16,12))
plt.xlabel('Age')
plt.ylabel('Mean')


# Initialize the figure
plt.style.use('seaborn-darkgrid')
 
# create a color palette
palette = [
    'blue', 'blue', 'green', 'magenta', 'cyan', 'brown', 'grey', 'red', 'silver', 'purple', 'gold', 'black','olive'
]
# multiple line plot
num=1
for column in df_mean_sub.drop('Age', axis=1):
    
    # Find the right spot on the plot
    plt.subplot(4,3, num)
 
    # Plot the lineplot
    plt.plot(df_mean_sub['Age'], df_mean_sub[column], marker='', color=palette[num], linewidth=4, alpha=0.9, label=column)
 
    # Same limits for everybody!
    plt.xlim(1,60)
    plt.ylim(1,9)
 
    # Not ticks everywhere
    if num in range(7) :
        plt.tick_params(labelbottom='off')
    if num not in [1,4,7] :
        plt.tick_params(labelleft='off')
 
    # Add title
    plt.title(column, loc='left', fontsize=12, fontweight=0, color=palette[num])
    plt.text(30, -1, 'Age', ha='center', va='center')
    plt.text(1, 4, 'Mean Substructure Rating', ha='center', va='center', rotation='vertical')
    num = num + 1
 
# general title
plt.suptitle("Mean Substructure Rating vs  Age \nIndividual State Deterioration Curves", fontsize=13, fontweight=0, color='black', style='italic', y=1.02)


Out[43]:
<matplotlib.text.Text at 0x7f05b4ae44a8>

In [84]:
def getDataOneYear(state):
    pipeline = [{"$match":{"$and":[{"year":{"$gt":2015, "$lt":2017}},{"stateCode":state}]}},
            {"$project":{"_id":0,
                         "structureNumber":1,
                         "yearBuilt":1,
                         "deck":1, ## rating of deck
                         "year":1, ## survey year
                         "substructure":1, ## rating of substructure
                         "superstructure":1, ## rating of superstructure
                         }}]
    
    dec = collection.aggregate(pipeline)
    conditionRatings = pd.DataFrame(list(dec))  
    conditionRatings['Age'] = conditionRatings['year'] - conditionRatings['yearBuilt']
    
    
    return conditionRatings

The mean deterioration curve can be a measure to evaluate the rate of deterioration. If the condition rating of a bridge lies above the deterioration curve then the bridge is deteriorating at a slower pace than mean deterioration of the bridges, and if the condition rating of the bridge lies below the deterioration curve of the bridges then it is deteriorating at a faster pace than the mean deterioration of the bridges.

This concept can further be extended to calculate deterioration score. Deterioration score denotes the rate of deterioration. A positive deterioration denotes that the individual bridge is deteriorating at a slower pace than the mean rate of deterioration of bridges, and a negative deterioration denotes that the individual bridge is deteriorating at a higher pace than the mean deterioration of the bridges.

The following provides definition of deterioration score:

Classification Criteria

The classfication criteria used to classify bridges into slow Deterioration, average deterioration and fast deterioration. Bridges are classified based on how far an individual bridge’s deterioration score is from the mean deterioration score.

Categories Value
Slow Deterioration $z_ia$ ​ ≥ $\bar x_a$ ​ + 1 σ ( $ x_a$ )​
Average Deterioration $\bar x_a$ ​ - 1 σ ( $x_a$ )​ ≥ $z_ia$ ≥ $\bar x_a$ ​ + 1 σ ( $ x_a$ )​
Fast Deterioration $z_ia$ ​ ≤ $\bar x_a$ ​ - 1 σ ( $ x_a$ )​

In [103]:
stat = ['31','19','17','18','20','26','27','29','38','46','39','55']  
AgeList = list(conditionRatings['Age'])
deckList = list(conditionRatings['deck'])
num = 1
for st in stat:
    deckR = []
    deckR = getDataOneYear(st)
    deckR = deckR[['Age','deck']]
    deckR= deckR.loc[~deckR['deck'].isin(['N','NA'])]
    stateName = stateNameDict[st]
    labels = []
    for deckRating, Age in zip (deckList,AgeList):
        if Age < 60:
            mean_age_conditionRating = df_mean_deck[stateName][Age]
            std_age_conditionRating = df_std_deck[stateName][Age]

            detScore = (int(deckRating) - mean_age_conditionRating) / std_age_conditionRating

            if (mean_age_conditionRating - std_age_conditionRating) < int(deckRating) <= (mean_age_conditionRating + std_age_conditionRating):
                   # Append a label
                labels.append('Average Deterioration')
                # else, if more than a value,
            elif int(deckRating) > (mean_age_conditionRating + std_age_conditionRating):
                   # Append a label
                labels.append('Slow Deterioration')
                # else, if more than a value,
            elif int(deckRating) < (mean_age_conditionRating - std_age_conditionRating):
                   # Append a label
                labels.append('Fast Deterioration')
            else:
                labels.append('Null Value')
    D = dict((x,labels.count(x)) for x in set(labels))
   
    plt.figure(figsize=(12,6))
    plt.title(stateName)
    plt.bar(range(len(D)), list(D.values()), align='center')
    plt.xticks(range(len(D)), list(D.keys()))
    plt.xlabel('Categories')
    plt.ylabel('Number of Bridges')
    plt.show()
    num = num + 1



In [ ]:
stat = ['31','19','17','18','20','26','27','29','38','46','39','55']  
AgeList = list(conditionRatings['Age'])
deckList = list(conditionRatings['deck'])
num = 1
label = []
for st in stat:
    deckR = []
    deckR = getDataOneYear(st)
    deckR = deckR[['Age','deck']]
    deckR= deckR.loc[~deckR['deck'].isin(['N','NA'])]
    stateName = stateNameDict[st]
    
    for deckRating, Age in zip (deckList,AgeList):
        if Age < 60:
            mean_age_conditionRating = df_mean_deck[stateName][Age]
            std_age_conditionRating = df_std_deck[stateName][Age]

            detScore = (int(deckRating) - mean_age_conditionRating) / std_age_conditionRating

            if (mean_age_conditionRating - std_age_conditionRating) < int(deckRating) <= (mean_age_conditionRating + std_age_conditionRating):
                   # Append a label
                labels.append('Average Deterioration')
                # else, if more than a value,
            elif int(deckRating) > (mean_age_conditionRating + std_age_conditionRating):
                   # Append a label
                labels.append('Slow Deterioration')
                # else, if more than a value,
            elif int(deckRating) < (mean_age_conditionRating - std_age_conditionRating):
                   # Append a label
                labels.append('Fast Deterioration')
            else:
                labels.append('Null Value')

Classification of all the bridges in the Midwestern United States


In [100]:
D = dict((x,labels.count(x)) for x in set(labels))
plt.figure(figsize=(12,6))
plt.title('Classification of Bridges in Midwestern United States')
plt.bar(range(len(D)), list(D.values()), align='center')
plt.xticks(range(len(D)), list(D.keys()))
plt.xlabel('Categories of Bridges')
plt.ylabel('Number of Bridges')
plt.show()



In [7]:
st = getData('31')

In [8]:
st


Out[8]:
deck stateCode structureNumber substructure superstructure year yearBuilt Age
0 6 31 C000100305 5 7 1992 1935 57
1 7 31 C000100305P 7 7 1992 1935 57
2 5 31 C000100405 7 7 1992 1925 67
3 8 31 C000100505P 8 7 1992 1974 18
4 7 31 C000100905 7 7 1992 1962 30
5 7 31 C000100910 8 8 1992 1968 24
6 9 31 C000101004 9 9 1992 1985 7
7 9 31 C000101005 9 9 1992 1987 5
8 9 31 C000101010 9 9 1992 1986 6
9 6 31 C000101105 6 7 1992 1935 57
10 N 31 C000101203 N N 1992 1970 22
11 7 31 C000101205 7 7 1992 1935 57
12 8 31 C000101210 5 5 1992 1935 57
13 8 31 C000101215 8 7 1992 1935 57
14 6 31 C000101220 5 6 1992 1935 57
15 5 31 C000101305 2 3 1992 1935 57
16 6 31 C000101310 3 6 1992 1935 57
17 6 31 C000101315 6 4 1992 1935 57
18 8 31 C000101320 6 6 1992 1935 57
19 7 31 C000101405 7 7 1992 1935 57
20 6 31 C000101410 8 8 1992 1937 55
21 4 31 C000101415 6 5 1992 1935 57
22 7 31 C000101420 7 7 1992 1935 57
23 N 31 C000101425 N N 1992 1935 57
24 5 31 C000101430 6 7 1992 1935 57
25 5 31 C000101435 7 7 1992 1935 57
26 7 31 C000101505 6 7 1992 1974 18
27 8 31 C000101510 7 7 1992 1974 18
28 7 31 C000101605 4 7 1992 1935 57
29 9 31 C000101605P 7 7 1992 1935 57
... ... ... ... ... ... ... ... ...
400509 7 31 U1825J5105 7 7 2016 2002 14
400510 7 31 U1825J5505 7 7 2016 1998 18
400511 7 31 U1825K1435 7 7 2016 1994 22
400512 6 31 U1825K2205 7 7 2016 1990 26
400513 7 31 U1825K4510 7 7 2016 1935 81
400514 N 31 U1825M1205 N N 2016 1982 34
400515 5 31 U1825M5535 7 7 2016 1980 36
400516 7 31 U1825N3115 8 8 2016 2007 9
400517 7 31 U1825Q2005 7 7 2016 1998 18
400518 6 31 U1825Q3505 6 6 2016 1977 39
400519 7 31 U1825Q5505 7 7 2016 2000 16
400520 7 31 U1825Q5510 7 7 2016 1983 33
400521 7 31 U1825R1805 7 7 2016 2004 12
400522 8 31 U191500305P 9 9 2016 2009 7
400523 8 31 U191500610P 9 9 2016 2009 7
400524 8 31 U191500805P 7 8 2016 1991 25
400525 8 31 U191532910 9 9 2016 2009 7
400526 5 31 U220002010P 7 8 2016 1998 18
400527 8 31 U223004103P 8 8 2016 1996 20
400528 5 31 U223014310 5 4 2016 1936 80
400529 8 31 U223014410 8 8 2016 1996 20
400530 N 31 U223024505 N N 2016 2014 2
400531 8 31 U2230K4105 8 8 2016 1996 20
400532 5 31 U269502505P 5 6 2016 1970 46
400533 8 31 U269506086 9 8 2016 2004 12
400534 5 31 U269506204 6 5 2016 1938 78
400535 5 31 U269506208 6 5 2016 1938 78
400536 8 31 U269532830 8 8 2016 2003 13
400537 8 31 U2695E2305 7 8 2016 1982 34
400538 7 31 U2695L2605 8 8 2016 1983 33

400539 rows × 8 columns


In [42]:
def getData(state):
    pipeline = [{"$match":{"$and":[{"year":{"$gt":2015, "$lt":2017}},{"stateCode":state}]}},
            {"$project":{"_id":0,
                         "stateCode":1,
                         "structureNumber":1,
                         "yearBuilt":1,
                         "deck":1, ## rating of deck
                         "year":1, ## survey year
                         "substructure":1, ## rating of substructure
                         "superstructure":1, ## rating of superstructure
                         }}]
    
    dec = collection.aggregate(pipeline)
    conditionRatings = pd.DataFrame(list(dec))  
    conditionRatings['Age'] = conditionRatings['year'] - conditionRatings['yearBuilt']
    return conditionRatings

conditionRatings = getData('31')
conditionRatings = conditionRatings[['structureNumber','Age','superstructure','deck','substructure','year']]
conditionRatings = conditionRatings.loc[~conditionRatings['superstructure'].isin(['N','NA'])]
conditionRatings = conditionRatings.loc[~conditionRatings['substructure'].isin(['N','NA'])]
conditionRatings = conditionRatings.loc[~conditionRatings['deck'].isin(['N','NA'])]

In [51]:
## state, total bridges, PSD, PFD, PAD, Mean Deck, Mean Substructure, Mean Superstructure, Text 
dict_state_info = {}
stateNameDict = {'25':'MA',
                '04':'AZ',
                '08':'CO',
                '38':'ND',
                '09':'CT',
                '19':'IA',
                '26':'MI',
                '48':'TX',
                '35':'NM',
                '17':'IL',
                '51':'VA',
                '23':'ME',
                '16':'ID',
                '36':'NY',
                '56':'WY',
                '29':'MO',
                '39':'OH',
                '28':'MS',
                '11':'DC',
                '21':'KY',
                '18':'IN',
                '06':'CA',
                '47':'TN',
                '12':'FL',
                '24':'MD',
                '34':'NJ',
                '46':'SD',
                '13':'GA',
                '55':'WI',
                '30':'MT',
                '54':'WV',
                '15':'HI',
                '32':'NV',
                '37':'NC',
                '10':'DE',
                '33':'NH',
                '44':'RI',
                '50':'VT',
                '42':'PA',
                '05':'AR',
                '20':'KS',
                '45':'SC',
                '22':'LA',
                '40':'OK',
                '72':'PR',
                '41':'OR',
                '27':'MN',
                '53':'WA',
                '01':'AL',
                '31':'NE',
                '02':'AK',
                '49':'UT'
               }
stateAbbFull = {'MA':'Massachusetts',
                'AZ':'Arizona',
                'CO':'Colorado',
                'ND':'North Dakota',
                'CT':'Connecticut',
                'IA':'Iowa',
                'MI':'Michigan',
                'TX':'Texas',
                'NM':'New Mexico',
                'IL':'Illinois',
                'VA':'Virginia',
                'ME':'Maine',
                'ID':'Idaho',
                'NY':'New York',
                'WY':'Wyoming',
                'MO':'Missouri',
                'OH':'Ohio',
                'MS':'Mississip[i',
                'DC':'Washington DC',
                'KY':'Kentucky',
                'IN':'Indiana',
                'CA':'California',
                'TN':'Tennessee',
                'FL':'Florida',
                'MD':'Maryland',
                'NJ':'New Jersey',
                'SD':'South Dakota',
                'GA':'Georgia',
                'WI':'Wisconsin',
                'MT':'Montana',
                'WV':'West Virginia',
                'HI':'Hawaii',
                'NV':'Nevada',
                'NC':'North Carolina',
                'DE':'Delaware',
                'NH':'New Hampshire ',
                'RI':'Rhode Island',
                'VT':'Vermont',
                'PA':'Pennsylvania',
                'AR':'Arkansas',
                'KS':'Kansas',
                'SC':'South Carolina',
                'LA':'Louisiana',
                'OK':'Oklahoma',
                'PR':'Puerto Rico',
                'OR':'Oregon',
                'MN':'Minnesota',
                'WA':'Washington',
                'AL':'Alabama',
                'NE':'Nebraska',
                'AK':'Alaska',
                'UT':'Utah'
               }

    
code = []
name = []
nob  = []
sup  = []
sub  = []
deck = []
for state in states:
    conditionRatings = getData(state)
    conditionRatings = conditionRatings[['structureNumber','Age','superstructure','deck','substructure','year']]
    conditionRatings = conditionRatings.loc[~conditionRatings['superstructure'].isin(['N','NA'])]
    conditionRatings = conditionRatings.loc[~conditionRatings['substructure'].isin(['N','NA'])]
    conditionRatings = conditionRatings.loc[~conditionRatings['deck'].isin(['N','NA'])]
    code.append(stateNameDict[state])
    name.append(stateAbbFull[stateNameDict[state]])
    nob.append(len(conditionRatings))
    
    sub.append(np.mean([int(j) for j in conditionRatings['substructure']]))
    sup.append(np.mean([int(j) for j in conditionRatings['superstructure']]))
    deck.append(np.mean([int(j) for j in conditionRatings['deck']]))

dict_state_info['code'] = code
dict_state_info['name'] = name
dict_state_info['number of bridges'] = nob
dict_state_info['mean substructure'] = sub
dict_state_info['mean supertructure'] = sup
dict_state_info['mean deck'] = deck

In [52]:
dict_state_info


Out[52]:
{'code': ['NE',
  'IA',
  'IL',
  'IN',
  'KS',
  'MI',
  'MN',
  'MO',
  'ND',
  'SD',
  'OH',
  'WI'],
 'mean deck': [6.4123968363136177,
  6.2119161827821259,
  6.7837042486665444,
  6.4582940066068897,
  6.636711738662199,
  6.5369729729729729,
  6.8090346371658104,
  6.8155807805939981,
  6.6432370820668689,
  6.0343511450381682,
  7.0851791345313631,
  6.6300906842539158],
 'mean substructure': [6.6807083906464921,
  6.0281747033577382,
  6.9478112929924594,
  6.5817012741859369,
  6.5953093215268641,
  6.5870270270270268,
  6.7992888186487557,
  6.8427408904784119,
  6.3351063829787231,
  5.8547373147732378,
  7.0844952699365527,
  6.8286892003297606],
 'mean supertructure': [6.9000171939477308,
  6.3472860388790711,
  6.7378149714916313,
  6.4867272298253891,
  6.7351322244824701,
  6.5248648648648651,
  6.8987225075727645,
  6.8542994868605192,
  6.6812310030395139,
  6.1493039964077232,
  7.116788875802591,
  6.8131079967023904],
 'name': ['Nebraska',
  'IOWA',
  'Illinois',
  'Indiana',
  'Kansas',
  'Michigan',
  'Minnesota',
  'Missouri',
  'North Dakota',
  'South Dakota',
  'Ohio',
  'Wisconsin'],
 'number of bridges': [11632,
  19805,
  21748,
  16952,
  16714,
  9250,
  7593,
  19293,
  2632,
  4454,
  52642,
  12130]}

In [53]:
pd.DataFrame(dict_state_info)


Out[53]:
code mean deck mean substructure mean supertructure name number of bridges
0 NE 6.412397 6.680708 6.900017 Nebraska 11632
1 IA 6.211916 6.028175 6.347286 IOWA 19805
2 IL 6.783704 6.947811 6.737815 Illinois 21748
3 IN 6.458294 6.581701 6.486727 Indiana 16952
4 KS 6.636712 6.595309 6.735132 Kansas 16714
5 MI 6.536973 6.587027 6.524865 Michigan 9250
6 MN 6.809035 6.799289 6.898723 Minnesota 7593
7 MO 6.815581 6.842741 6.854299 Missouri 19293
8 ND 6.643237 6.335106 6.681231 North Dakota 2632
9 SD 6.034351 5.854737 6.149304 South Dakota 4454
10 OH 7.085179 7.084495 7.116789 Ohio 52642
11 WI 6.630091 6.828689 6.813108 Wisconsin 12130

In [ ]: